Abstract The burst in the use of online social networks over the last decade 
has provided evidence that current rumor spreading models miss some funda- 
mental ingredients in order to reproduce how information is disseminated. In 
particular, recent literature has revealed that these models fail to reproduce 
the fact that some nodes in a network have an influential role when it comes 
to spreading a piece of information. In this work, we introduce two mechanisms 
with the aim of filling the gap between theoretical and experimental results. 
The first model introduces the assumption that spreaders are not always ac- 
tive whereas the second model considers the possibility that an ignorant is 
not interested in spreading the rumor. In both cases, results from numerical
simulations agree more closely with real data than classical rumor spreading
models. Our results shed some light on the mechanisms underlying the
spreading of information and ideas in large social systems and pave the way
for more realistic diffusion models.

Keywords Rumor spreading • Online social networks • Human activity 
patterns 



1 Introduction 

Understanding the way in which a disease or a piece of information spreads 
from person to person is of obvious practical relevance. If we were able to
comprehend the mechanisms that dominate such spreading processes, we would
be able to enhance the spread of valuable information through a community
or contain an outbreak of an infectious disease. The similarities between these
two processes, epidemics and information diffusion (also referred to as rumor
spreading [16, 19]), have long been recognized, and the two fields have evolved
in parallel, freely borrowing ideas and concepts from each other [5].

In epidemics it has become clear that some individuals that are "super- 
spreaders" play a dominant role in the course of an epidemic [50] . Intuitively, 
one expects that similarly influential individuals would also be present in 
the case of information diffusion, and recent years have witnessed a growing
interest in understanding how to identify them [3, 6, 20, 24, 1]. Successful
approaches have focused on studying the effect that different network-based
centrality measures have on rumor spreading. In particular, one recent seminal
approach [17] has identified the k-core as the best measure to predict
influence, outperforming degree centrality or betweenness in the context of
an epidemic spreading process. This insight has been followed by many other
works, which mainly discuss under which circumstances the k-core actually
predicts a node's disease spreading capabilities [7] or propose alternative
measures of influence [8, 18].

Following the original proposal of Kitsak et al. [17], Borge-Holthoefer and
Moreno [1] studied rumor spreading dynamics to learn whether the k-core
could predict authority or not. Surprisingly, their results indicate that a ru- 
mor's success — measured as the number of individuals that learn about the 
rumor at the end of the spreading dynamics — is topology-independent: no 
matter who in a network triggers the rumor, the final number of nodes who 
learn about it will be the same (given the same spreading parameters). Ad- 
ditionally, central nodes (those at the highest core levels) behave as firewalls, 
short-circuiting the capacity of the rumor to spread further. This theoret- 
ical prediction is clearly at odds with empirical evidence and points to a 
shortcoming of theoretical models that must be overcome. 

The development of the Web 2.0 and the growing popularity of online 
social networks have not only had a tremendous impact on our daily lives, 
but they also had the beneficial consequence of generating detailed data 
on social communication patterns, which can ultimately inspire and guide 
the development of more realistic models. In this paper we try to fill the 
gap between observations from real systems and theoretical predictions by 
introducing some simple modifications to previously proposed models [21, 22].
The resulting models better approximate the behavior of users as observed in
online social networks, in particular the fact that there are influential
nodes with larger diffusion capacities, an important feature not accounted
for by current rumor spreading models.

Our analysis starts from the simple empirical observation [21, 23, 15] that
individuals display complex activity patterns both online and offline and, in
particular, are not active around the clock. This fact has two possible
interpretations. On the one hand, users that are actually spreading the rumor
are active only at specific times, and only then are they able to participate
in the diffusion process. On the other hand, an individual's choice to become
active and participate in a specific information cascade can be seen as a
demonstration of interest in the topic and of a willingness to spread it.
Inspired by these two interpretations, we derive two different rumor diffusion
models.

The first model incorporates the differences in the activity of the
individuals responsible for the spreading of the rumor. Each spreader is
assigned a randomly chosen probability of being active at a given time. In
this context, we study the effects of heterogeneity [33] in the activation
probability by drawing its values from three different probability
distributions, ranging from a uniform to a long-tailed one. In a more realistic
version, following the idea that more active users usually also have a central
role in the topology of the network, we relate the activity of each individual
to its degree.

The second model takes into account the fact that an individual could 
learn the rumor without actually spreading it further. This is for example 
what happens in most online social networks, in which followers receive pieces
of information from those they are following but only rarely transmit the news
further. We therefore introduce the possibility that a person who comes into
contact with the rumor does not spread it any further.
This approach is complementary to the previous one, as we consider that it 
is the ignorants and not the spreaders who can be inactive. 

In the remainder of the paper, we will show that even though these
alternatives introduce only small and intuitive changes, they are able to shed
light on the complex social mechanisms at work in real social systems and, at
least qualitatively, reproduce the heterogeneities observed in Twitter data.
The rest of the manuscript is organized as follows: in the next section, we
present a general framework for rumor spreading on networks, while Sections 2.1
and 2.2 present the two modified models and the results of numerical
simulations. Finally, we draw our conclusions in Section 3.



2 General modeling framework 

In classical rumor spreading models on networks, each of the N nodes of a
network can be in one of three possible states. A node holding a rumor and
willing to transmit it is called a spreader. Nodes that are unaware of the
rumor will be referred to as ignorants, while those that already know it but
are not willing to spread it further are called stiflers. We denote the density
of ignorants, spreaders, and stiflers at time t as i(t), ρ(t) and r(t),
respectively, with i(t) + ρ(t) + r(t) = 1, ∀t. The spreading process takes
place along the links connecting spreaders and ignorants. At each time step,
spreaders contact all of their neighboring nodes. In the simplest case,
whenever a spreader j contacts a node n that is ignorant, the latter will
become a spreader with a fixed probability λ. Otherwise, if n is already a
spreader or a stifler, the node j will turn into a stifler with probability α.
Mathematically, the general model can be represented as:

$$ I + S \xrightarrow{\lambda} 2S, \qquad S + S \xrightarrow{\alpha} S + R, \qquad S + R \xrightarrow{\alpha} 2R, $$

where the initial conditions are set such that i(0) = 1 - 1/N, ρ(0) = 1/N
and r(0) = 0. In addition, and without loss of generality, we set λ = 1 unless
other values are explicitly stated.

Fig. 1 (color online) Density of stiflers at the end of the rumor spreading,
r^∞(k_s), originated in a node that is part of the k_s k-core for the general
model of rumor spreading (squares), and the empirical fraction, N_c/N, of users
reached by cascades originated at nodes with k-core k_s (circles), as extracted
from real Twitter data (for details see [4, 15]). Numerical simulations were
run using the same empirical Twitter follower network.

For each alternative model presented, extensive numerical calculations
have been carried out by simulating the dynamics of rumor propagation on
top of a real-world Twitter following/follower network [15]. From an initial
scenario in which all nodes belong to the ignorant class except the seed, we
perform S = 10 simulations. This is repeated for each node, i.e., every vertex
of a network of N nodes acts as the initial seed S times, to obtain
statistically significant results. In this way, for each node i, we average the
final density of stiflers in the network. This quantity accounts for the
spreading capacity of node i, which quantifies how deeply the rumor penetrated
the network when node i was the initial seed:

$$ r_i^{\infty} = \frac{1}{S} \sum_{m=1}^{S} r_{i,m}^{\infty}, \tag{1} $$

where r_{i,m}^∞ represents the final density of stiflers for a particular run m
with origin at node i. With this information at hand for all nodes, we
coarse-grain the individual r_i^∞ values into classes of nodes according to
their core number. Thus, r^∞(k_s) represents the average stifler density for
all runs with a seed of core index k_s:

$$ r^{\infty}(k_s) = \frac{1}{N_{k_s}} \sum_{i \in T_{k_s}} r_i^{\infty}, \tag{2} $$

where T_{k_s} is the set of all N_{k_s} nodes with core index k_s.
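
The averaging and coarse-graining steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the simulations: the function names (`core_numbers`, `spreading_capacity`, `coarse_grain_by_core`), the adjacency-dictionary input and the toy numbers are all assumptions made here, and the core index is computed by simple iterative pruning on an undirected graph.

```python
from collections import defaultdict
from statistics import mean


def core_numbers(neighbors):
    """Core index of every node, obtained by iteratively pruning low-degree nodes."""
    degree = {v: len(adj) for v, adj in neighbors.items()}
    remaining = set(neighbors)
    core, k = {}, 0
    while remaining:
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1          # nothing left to prune at this level
            continue
        for v in peel:
            core[v] = k
            remaining.discard(v)
            for w in neighbors[v]:
                if w in remaining:
                    degree[w] -= 1
    return core


def spreading_capacity(runs_by_seed):
    """Average the final stifler density over the runs started at each seed."""
    return {seed: mean(runs) for seed, runs in runs_by_seed.items()}


def coarse_grain_by_core(capacity, core):
    """Average the per-seed capacities over all seeds sharing a core index."""
    classes = defaultdict(list)
    for seed, r_inf in capacity.items():
        classes[core[seed]].append(r_inf)
    return {ks: mean(values) for ks, values in sorted(classes.items())}


# Toy example (hypothetical data): a triangle 0-1-2 with a pendant node 3.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
runs_by_seed = {0: [0.6, 0.8], 1: [0.4, 0.6], 2: [0.2, 0.4], 3: [0.1, 0.2]}

core = core_numbers(graph)
capacity = spreading_capacity(runs_by_seed)
by_core = coarse_grain_by_core(capacity, core)
```

In this toy graph the triangle forms the 2-core and the pendant node sits in the 1-core, so `by_core` holds one averaged value per core index.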

Figure 1 compares the values of r^∞(k_s) obtained via numerical simulations of
the above rumor spreading model with the observed fraction of users, N_c/N,
reached by cascades originated at nodes with core index k_s, obtained by
analyzing Twitter usage data from the Spanish Indignados movement (see [4, 15]
for details on how the data were extracted and analyzed). The differences
observed in this plot are striking. Even though the model was run on the exact
same network, the theoretical prediction is completely insensitive to the
k-core of the originating node, while in the empirical data there is a clear
correlation between belonging to higher cores and larger numbers of nodes
reached by the cascade. This difference in behavior clearly shows that there is
something fundamentally lacking in the theoretical model.

2.1 Model I: Human activity and temporal patterns 

We next consider the possibility that nodes are not always available to take
part in a certain communication exchange. Each individual i is active with a
certain probability, a_i, which affects his/her behavior as a spreader. Thus,
on top of the constraints of the basic framework presented above, we assume
that a spreader only attempts to spread the rumor when it is active. As a
consequence, the transition from the class of ignorants to the class of
spreaders happens less often.
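
A single run of this activity-gated dynamics might look like the sketch below. It is only an illustration under stated assumptions: the function name `simulate_model_one`, the adjacency-dictionary representation, the sequential update order within a time step and the `max_steps` safety cap are choices made here, and a contact with either a spreader or a stifler is taken to stifle the contacting node with probability `alpha`.

```python
import random


def simulate_model_one(neighbors, activity, seed, lam=1.0, alpha=0.5,
                       rng=None, max_steps=100_000):
    """Return the final stifler density of one run started at `seed`.

    neighbors: dict mapping each node to the list of its neighbors.
    activity:  dict mapping each node i to its activation probability a_i.
    """
    rng = rng or random.Random()
    state = {node: "I" for node in neighbors}  # I ignorant, S spreader, R stifler
    state[seed] = "S"
    spreaders = {seed}
    for _ in range(max_steps):                 # cap guards against endless runs
        if not spreaders:
            break
        for j in list(spreaders):
            if rng.random() >= activity[j]:
                continue                       # inactive spreaders make no contacts
            for n in neighbors[j]:
                if state[n] == "I":
                    if rng.random() < lam:
                        state[n] = "S"         # acts from the next time step on
                        spreaders.add(n)
                elif rng.random() < alpha:
                    state[j] = "R"             # j stops spreading
                    spreaders.discard(j)
                    break
    return sum(s == "R" for s in state.values()) / len(state)


# Toy usage: a fully active triangle with lam = alpha = 1.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
always_on = {node: 1.0 for node in triangle}
final_density = simulate_model_one(triangle, always_on, seed=0,
                                   lam=1.0, alpha=1.0, rng=random.Random(1))
```

Setting every `activity[i]` to 1 recovers the general model, which makes it easy to compare both variants on the same network.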

It is worth mentioning that the approach adopted in our model is rooted in the
observation that human activity patterns are mostly heterogeneous, so that
individuals are not always active [53], nor is their activity distributed
randomly over time [2, 14, 25]. However, we assume that nodes in the network
still have memory of who their potential neighbors are: although not all the
links of a given node are concurrently active, the set of available neighbors
is predefined by the underlying static (aggregated) topology. A more accurate
description would require considering that the topology is shaped by the
activity of the nodes, so that the resulting time-varying networks are
activity-driven [23]. In the latter case, the interactions between the
different classes of nodes in the system would still be activity-driven, but no
memory of the static topology would be present, as the interaction structure is
redefined at each time step. Whether or not both mechanisms lead to similar
behavior is a matter that deserves further investigation.

Fig. 2 (color online) Density of stiflers at the end of the rumor spreading,
r^∞(k_s), originated in a node of k-core k_s in the activation probability
model. Three different probability distribution functions are used: a power law
with exponent γ = 2.5 (upper panel), an exponential with a_c = 0.1 (middle
panel) and a uniform distribution (lower panel).

On the other hand, note that being active or not has no effect on the 
rumor's recipients (ignorants). This mechanism is specific to asynchronous
communication systems such as Twitter, FedEx, email or SMS, where information
can be sent without requiring the collaboration of the recipient.
On the contrary, for synchronous systems, such as phone calls or Instant 
Messaging, that require both the source and the target of a message to be 
active at the same time, such a scheme would not suffice. 

Here we explore three possibilities for the activity distribution: (a)
uniform, P(a) ~ c; (b) exponential, P(a) ~ e^{-a/a_c}; and (c) power law,
P(a) ~ a^{-γ}. Interestingly, these distributions yield completely different
results, as Figure 2 illustrates. Increasing the heterogeneity of the activity
patterns moves the distribution of outbreak sizes, r^∞(k_s), closer to the
empirical results, highlighting the fact that heterogeneity is a fundamental
factor in real information spreading processes. A uniform activity
distribution (lower panel) completely flattens the spreading capabilities, no
matter whether nodes are in a topologically relevant region or not. This is in
good agreement with [3], the only difference being the time the system needs to
reach a final state (that is, the activation probabilities delay the process
significantly). An exponential distribution introduces some asymmetry in the
activity distribution, which slightly affects the spreading results (middle
panel). Finally, a power-law probability distribution introduces heterogeneity
in the spreading success: the higher k_s, the higher the spreading capacity,
just as has been found empirically [5, 15].

The importance of the spreader-to-stifler rate is revealed in the
heterogeneous scenario, where α sets the timescale relevant to this process.
For high values of α, spreaders quickly become stiflers and the rumor does not
have the possibility of reaching a significant fraction of the population,
while lower values of α easily allow for successful dissemination.
Furthermore, it should be noted that we are assigning activity levels entirely
at random, without any relation between topological features and activity
probabilities. This means that a poorly connected node is just as likely to be
highly active as a node with high degree. However, it has been seen previously
[23] that activity distributions are correlated with the observed degree
distribution. The simplest way of implementing this correlation is to assign
to node i an activity probability a_i = k_i/k_max.
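
The different ways of assigning activation probabilities can be sketched as follows. The helper names are hypothetical, the exponential is truncated to the unit interval so that it yields valid probabilities, and the lower cutoff `a_min` of the power law is an assumption introduced here (some cutoff is needed for the distribution to be normalizable); none of these details is taken from the text.

```python
import math
import random


def sample_uniform(rng):
    """P(a) ~ c on [0, 1)."""
    return rng.random()


def sample_exponential(rng, a_c=0.1):
    """P(a) ~ exp(-a / a_c), truncated to [0, 1], by inverse-transform sampling."""
    u = rng.random()
    return -a_c * math.log(1.0 - u * (1.0 - math.exp(-1.0 / a_c)))


def sample_power_law(rng, gamma=2.5, a_min=1e-3):
    """P(a) ~ a**(-gamma) on [a_min, 1]; a_min is an assumed lower cutoff."""
    u = rng.random()
    low = a_min ** (1.0 - gamma)
    return (low + u * (1.0 - low)) ** (1.0 / (1.0 - gamma))


def degree_based_activity(neighbors):
    """Degree-correlated assignment a_i = k_i / k_max."""
    k_max = max(len(adj) for adj in neighbors.values())
    return {node: len(adj) / k_max for node, adj in neighbors.items()}


# Toy usage on a star graph: the hub is always active, the leaves rarely are.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
activity = degree_based_activity(star)

rng = random.Random(0)
random_activity = {node: sample_power_law(rng) for node in star}
```

Either dictionary can then be used as the per-node activation probability in a simulation of Model I.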

Figure 3 illustrates the results of this scenario. The great heterogeneity of
the degree distribution is clearly reflected. Rumors triggered from low-degree
nodes (which necessarily have low k_s) die out soon, because the nodes they
reach are almost never active. In contrast, high-degree nodes (which are more
likely to belong to a high k-core) persistently forward messages, turning any
rumor into system-wide knowledge. Note that spreading is almost identical
regardless of α, in stark contrast to the upper panel of Figure 2 (where α
determines the shape of spreading), possibly indicating that higher-level
correlations also play an important role.

Figure 4 shows the comparison between the topology-dependent and random
distributions of the activation probabilities. In this case, all the curves
have been obtained with a power-law activity distribution. However, in one of
them (blue diamonds) the activity of each node is proportional to its degree
(a_i = k_i/k_max), whereas in the other curves activation probabilities are
assigned at random and thus independently of the topological features of the
nodes. As in Fig. 2, in the randomly distributed case the spreading is highly
affected by α, whereas for the degree-dependent activation probabilities the
results are substantially independent of α.

2.2 Model II: Apathy 

The analysis of real data from online social networks demonstrates that most
of the time users do not react to received messages [5]. One possible
interpretation is that they have been informed of a rumor but choose not to
spread it. This interpretation suggests another ingredient that might be
missing from classical rumor spreading models and that might help bring them
closer to reality: the possibility that an ignorant is apathetic, goes
directly to the stifler status and does not participate further in the
spreading dynamics. As noted before, this kind of behavior is common in online
social networks like Twitter, in which one receives messages that are rarely
spread further. We incorporate this new element by introducing the
probability, p, that an ignorant is interested in the topic and decides to
diffuse it. In this scenario, when a spreader contacts an ignorant, the latter
turns into a spreader with probability λp and into a stifler with probability
(1 - p)λ. The transitions allowed by our model are then:

$$ I \xrightarrow{\lambda p} S, \qquad I \xrightarrow{\lambda (1-p)} R, \qquad S \xrightarrow{\alpha} R. $$

Fig. 3 (color online) Fraction of stiflers at the end of the rumor spreading,
r^∞(k_s), originated in a node of k-core k_s when activation probabilities are
proportional to the node's degree. Three different values of α are used; the
underlying network is the empirical Twitter follower network from Ref. [15].



It should be noted that Model II is a natural counterpart of Model I
presented in Section 2.1, in the sense that it also assigns activity
probabilities to each node. The main difference is that this uniform
probability p is assigned to ignorant individuals and determines whether or
not they choose to participate in the spreading process. A parallel can also be
drawn with the case of epidemic spreading in which a person becomes immune to a
disease upon coming into contact with a pathogen, before being able to develop
symptoms or spread it further.

Fig. 4 (color online) Fraction of stiflers at the end of the rumor spreading,
r^∞(k_s), originated in a node of k-core k_s when the activation probability is
proportional to the node's degree (diamonds) or randomly distributed (squares,
triangles and circles). Network topology and the other parameters are the same
as in Fig. 3.

The behavior of the system can be better understood analytically by 
writing the mean-field rate equations governing its time evolution in the 
homogeneous mixing approximation: 



$$ \frac{di(t)}{dt} = -\lambda p k \rho(t) i(t) - (1-p) \lambda k \rho(t) i(t), \tag{3} $$

$$ \frac{d\rho(t)}{dt} = \lambda p k \rho(t) i(t) - \alpha k \rho(t) \left[\rho(t) + r(t)\right], \tag{4} $$

$$ \frac{dr(t)}{dt} = \alpha k \rho(t) \left[\rho(t) + r(t)\right] + (1-p) \lambda k \rho(t) i(t), \tag{5} $$

with the initial conditions i(0) = 1 - 1/N, ρ(0) = 1/N, r(0) = 0, where ρ(t)
is the density of spreaders and k represents the number of contacts each
spreader has per unit time. The first term on the right-hand side of Eq. (3)
accounts for the density of ignorants that turn into spreaders after an
interaction, whereas the second term models the ignorant-to-stifler transition
with probability (1 - p)λ.
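
As a rough numerical check, the rate equations can be integrated with a simple forward-Euler scheme. The sketch below is an assumption-laden illustration: the function name `integrate_mean_field`, the step size `dt`, the population size `n`, the stopping threshold on the spreader density and the time horizon `t_max` are all choices made here, and very small values of p would need a much larger `t_max` to reach the final state.

```python
def integrate_mean_field(p, lam=1.0, alpha=0.5, k=4, n=10**6,
                         dt=1e-3, t_max=200.0):
    """Forward-Euler integration of the mean-field rate equations.

    Returns the densities (i, rho, r) of ignorants, spreaders and stiflers
    once the spreader density is negligible or t_max is reached.
    """
    i, rho, r = 1.0 - 1.0 / n, 1.0 / n, 0.0
    t = 0.0
    while t < t_max and rho > 1e-12:
        contacts = k * rho * dt                     # contacts made in this step
        to_spreader = lam * p * contacts * i        # I -> S
        to_stifler = (1.0 - p) * lam * contacts * i  # I -> R (apathy)
        stifled = alpha * contacts * (rho + r)      # S -> R
        i -= to_spreader + to_stifler
        rho += to_spreader - stifled
        r += stifled + to_stifler
        t += dt
    return i, rho, r


# Example with the parameter values quoted for the homogeneous case.
i_end, rho_end, r_end = integrate_mean_field(p=1.0)
```

Because every term removed from one compartment is added to another, the three densities keep summing to one throughout the integration, up to rounding.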
Fig. 5 (color online) Fraction of stiflers at the end of the rumor spreading
for different values of p, in comparison with the theoretical prediction of
Eq. (6). Numerical results are the average over 10^3 stochastic runs. In both
cases λ = 1.0, α = 0.5 and k = 4.



Recalling that i(t) + ρ(t) + r(t) = 1, we can study the system of Eqs. (3)-(5)
analytically in the infinite-time limit, ρ(∞) = 0, obtaining:

$$ r^{\infty} = 1 - e^{-\left(1 + \frac{\lambda p}{\alpha}\right) r^{\infty}}. \tag{6} $$
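
The step from the rate equations to Eq. (6) is not spelled out in the text. One way to obtain it, assuming N → ∞ so that i(0) → 1 and r(0) = 0, and writing ρ for the spreader density, is sketched below; it is offered as a consistency check under these assumptions.

```latex
\begin{align*}
% Total loss of ignorants (both terms of the ignorant equation combined):
\frac{di}{dt} &= -\lambda k \rho\, i, \\
% Stifler equation with \rho + r = 1 - i:
\frac{dr}{dt} &= \alpha k \rho\,(1 - i) + (1-p)\lambda k \rho\, i, \\
% Dividing the two eliminates time and the spreader density:
\frac{dr}{di} &= -\frac{\alpha}{\lambda}\,\frac{1-i}{i} - (1-p), \\
% Integrating from (i, r) = (1, 0) to (i^{\infty}, r^{\infty}):
r^{\infty} &= -\frac{\alpha}{\lambda}\ln i^{\infty}
  - \left(\frac{\alpha}{\lambda} - 1 + p\right)\left(1 - i^{\infty}\right), \\
% Using i^{\infty} = 1 - r^{\infty}, since \rho(\infty) = 0:
\left(\frac{\alpha}{\lambda} + p\right) r^{\infty}
  &= -\frac{\alpha}{\lambda}\ln\left(1 - r^{\infty}\right)
\;\;\Longrightarrow\;\;
r^{\infty} = 1 - e^{-\left(1 + \frac{\lambda p}{\alpha}\right) r^{\infty}}.
\end{align*}
```

For p = 1 the exponent reduces to -(1 + λ/α) r^∞, so the apathy parameter only enters through the effective ratio λp/α.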

The average total stifler density r^∞, for various values of p, obtained by
numerically solving this transcendental equation is shown in Figure 5. We
also performed a series of Monte Carlo (MC) simulations in the homogeneous
mixing limit. At t = 0 the entire population is ignorant, with only a small
fraction (~1/N) being spreaders. At each time step, each spreader contacts
k individuals chosen at random from the entire population. If the chosen
individual is an ignorant, it will become a spreader with probability λp or
directly move to stifler status with probability (1 - p)λ. Otherwise, when a
spreader comes into contact with a stifler or another spreader, it turns into a
stifler with probability α. When the spreading process reaches the absorbing
state ρ(t) = 0, the final density of stiflers is recorded. The simulation
results are also plotted in Figure 5 for comparison with the analytical
solution. The agreement between the two approaches is striking and confirms
that we are not missing any fundamental ingredients in our analysis.
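
Both procedures described above, solving the transcendental equation and running the homogeneous-mixing Monte Carlo, can be sketched compactly. The code below is illustrative only: the function names, the use of bisection, the population size `n`, the single initial spreader and the decision to ignore self-contacts are assumptions made here, and the run terminates only for λ > 0 and α > 0.

```python
import math
import random


def solve_final_density(p, lam=1.0, alpha=0.5, tol=1e-12):
    """Nontrivial root of r = 1 - exp(-(1 + lam*p/alpha) * r), by bisection."""
    beta = 1.0 + lam * p / alpha

    def f(r):
        return -math.expm1(-beta * r) - r   # 1 - exp(-beta*r) - r

    lo, hi = 1e-9, 1.0
    if f(lo) <= 0.0:                        # only the trivial root r = 0 exists
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def monte_carlo_run(p, n=2000, k=4, lam=1.0, alpha=0.5, rng=None):
    """One homogeneous-mixing run; returns the final stifler density."""
    rng = rng or random.Random()
    state = ["I"] * n
    state[0] = "S"
    spreaders = {0}
    while spreaders:
        for j in list(spreaders):
            for _ in range(k):
                m = rng.randrange(n)
                if m == j:
                    continue                # ignore self-contacts
                if state[m] == "I":
                    u = rng.random()
                    if u < lam * p:
                        state[m] = "S"      # interested: becomes a spreader
                        spreaders.add(m)
                    elif u < lam:
                        state[m] = "R"      # apathetic: directly a stifler
                elif rng.random() < alpha:
                    state[j] = "R"          # j met a spreader or stifler
                    spreaders.discard(j)
                    break
    return state.count("R") / n


r_theory = solve_final_density(p=1.0)
runs = [monte_carlo_run(p=1.0, rng=random.Random(s)) for s in range(5)]
r_mc = sum(runs) / len(runs)
```

Comparing `r_theory` with `r_mc` for several values of p gives a small-scale analogue of the comparison between the analytical curve and the simulation points.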

Although the addition of a constant parameter p (every node is assigned the
same p) is a crude approximation to the interest that ignorants might have
in becoming spreaders, it has profound implications for the system dynamics
when compared to the standard setup. Figure 6 shows the behavior of the
system with the inclusion of the new rule, with fixed λ = 1 and α = 0.5, for
different values of p. As for the power-law activity distribution in Model I
and in real data [5, 15, 25], a strong correlation between the k-core of the
seed and the final outcome of the spreading is observed. In particular, and
although we have made no effort to fit this value, it is clear that for a very
low probability (p = 10^-3) we already obtain a close-to-real behavior.
Although this value might seem low (only one in a thousand contacted
individuals forwards the rumor), one must consider that this is the probability
that an individual will choose to participate in any of the rumors he or she
observes. It is well known that most Twitter users commonly follow on the order
of hundreds of other individuals, so the number of pieces of content they are
exposed to daily can easily be on the order of thousands or tens of thousands,
of which they are only able, or willing, to participate in a few.

Fig. 6 (color online) Fraction of stiflers at the end of the rumor spreading,
r^∞(k_s), originated in a node of k-core k_s in the ignorant-to-stifler
transition model for different p values. For each value of p, both the average
and the maximum value over the numerical simulations are presented. Network
topology and the other parameters are the same as in Fig. 3.
3 Conclusions 

Online social networks are becoming increasingly central in our lives as they
come to permeate our daily activity. It then comes as no surprise that they
have been welcomed by mass social movements around the world as unique
platforms for the diffusion of new ideas and even for the coordination of large
numbers of individuals. Understanding the forces that drive the behavior of 
individuals interacting in these networks is then one of the great challenges 
for science in the next years. 

One interesting aspect is how ideas are shared between individuals and
which conditions allow for their large-scale dissemination. In this
context, several works have studied how a rumor can spread in a population of
ignorant individuals but, due to the changes in the way in which these tools
allow us to communicate, most of those works cannot capture the details of
rumor dynamics on such large-scale social systems.

Driven by data from a microblogging online platform we propose two 
modifications to classical rumor spreading models that are able to qualita- 
tively reproduce the observed differences in the number of individuals reached 
by the rumor when the seed is located in the most connected circles of the 
network or in its periphery. The models we present are based on the obser- 
vation that individuals, both spreaders and ignorants, are not always active 
in the network. Each model then implements a different effective mechanism 
that is consistent with this fact: Model I assigns activity probabilities to each 
node and allows for spreading to occur only when a spreader node is active 
while Model II assumes that each node has a finite probability of being
interested in spreading each specific rumor and would otherwise choose not to
participate in the diffusion process. Both variations have proved effective in
bringing the classical model one step closer to reality. 

In the case of Model I, numerical results highlight that the more
heterogeneous the patterns of activation are, the more faithfully we are able
to imitate real data. Moreover, if, in a second approximation, we relate the
activity of a node to its degree (as higher degrees are commonly correlated
with high levels of activity in the network), we also observe that the results
are substantially independent of the stifling rate, α. For Model II, we were
also able to give an analytical expression for the final density of stiflers in
the system. Interestingly, the analysis of the numerical simulations suggests
that close-to-real results are obtained when the probability for an ignorant
to be interested in the rumor is very low, another feature also observed in
real social networks.

The results presented in this paper clearly show that classical rumor
spreading models fall severely short in their ability to effectively approximate
reality. We have shown that even small, empirically based, modifications can 
significantly increase their level of realism. In particular, our results shed 
some light on the interplay between technology and human interactions that 
are at the origin of some of the complex behaviors we observe daily. With 
this work we have taken a significant first step in paving the way toward 
a deeper understanding of how ideas spread through our online and offline 
social networks and help shape current events and society as a whole. 